Maximum Entropy Density Estimation with Incomplete Data

Authors

  • Bert Huang
  • Ansaf Salleb-Aouissi
Abstract

We propose a natural generalization of Regularized Maximum Entropy Density Estimation (maxent) to handle input data with unknown values. While standard approaches to handling missing data usually involve estimating the actual unknown values, then using the estimated, complete data as input, our method avoids the two-step process and handles unknown values directly in the maximum entropy formulation.

The maxent method was recently proposed as an excellent method of presence-only prediction [2, 3]. In a presence-only framework, we are given a set, X, of data in which some of the data are labeled as positive. However, unlike the typical classification framework, the remaining unlabeled instances are not necessarily negative. Instead, they are considered to be of unknown class. The regularized maxent method treats the positively labeled points as random draws from some hidden distribution over X and attempts to estimate that distribution. Specifically, regularized maxent tries to find a distribution over X with maximum entropy such that the expected value of each feature is close to the observed mean of that feature over the positively labeled data. Let F be an N×D matrix of features such that F_ij is the j-th feature of the i-th datum. Let the vector m be the means of the D features of the labeled positive data. Then the standard regularized maxent optimization is:
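The optimization itself is not reproduced in this listing. As a rough illustration only, a commonly used form of regularized maxent (an assumption here, not necessarily the authors' exact statement) maximizes the entropy of a distribution p over the N data points subject to |Σ_i p_i F_ij − m_j| ≤ β for every feature j. Its dual minimizes log Z(λ) − λ·m + β‖λ‖₁ over Gibbs distributions p_i ∝ exp(F_i·λ). The sketch below solves that dual by proximal gradient descent; the names `fit_maxent` and `gibbs`, the parameter `beta`, and the step-size rule are hypothetical choices made for this example, and it covers only complete data, not the incomplete-data generalization the paper proposes.

```python
import numpy as np


def gibbs(F, lam):
    """Gibbs distribution p_i proportional to exp(F_i . lam), computed stably."""
    scores = F @ lam
    scores = scores - scores.max()  # shift to avoid overflow in exp
    weights = np.exp(scores)
    return weights / weights.sum()


def fit_maxent(F, m, beta=0.05, n_iter=5000):
    """Hypothetical helper: L1-regularized maxent over the rows of F.

    Minimizes log Z(lam) - lam . m + beta * ||lam||_1 by proximal
    gradient descent (ISTA). F is N x D, m holds the D feature means
    of the positively labeled rows. Returns (p, lam).
    """
    F = np.asarray(F, dtype=float)
    m = np.asarray(m, dtype=float)
    lam = np.zeros(F.shape[1])
    # The Hessian of log Z is a feature covariance, whose largest eigenvalue
    # is at most max_i ||F_i||^2, so this step size is safe.
    step = 1.0 / max(np.max(np.sum(F * F, axis=1)), 1e-12)
    for _ in range(n_iter):
        p = gibbs(F, lam)
        grad = F.T @ p - m  # expected features minus observed means
        z = lam - step * grad
        # Soft-thresholding is the proximal operator of the L1 penalty.
        lam = np.sign(z) * np.maximum(np.abs(z) - step * beta, 0.0)
    return gibbs(F, lam), lam


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    F = rng.random((50, 3))      # synthetic features, for illustration only
    m = F[:10].mean(axis=0)      # pretend the first 10 rows are positives
    p, lam = fit_maxent(F, m, beta=0.05)
    print("max constraint gap:", np.max(np.abs(F.T @ p - m)))
```

At convergence, each feature expectation under `p` lies within `beta` of the corresponding entry of `m`, and a large `beta` leaves `lam` at zero so that `p` stays uniform.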

Related resources

Maximum Entropy Density Estimation with Incomplete Presence-Only Data

We demonstrate a generalization of Maximum Entropy Density Estimation that elegantly handles incomplete presence-only data. We provide a formulation that is able to learn from known values of incomplete data without having to learn imputed values, which may be inaccurate. This saves the effort needed to perform accurate imputation while observing the principle of maximum entropy throughout the ...

Consistency and Generalization Bounds for Maximum Entropy Density Estimation

We investigate the statistical properties of maximum entropy density estimation, both for the complete data case and the incomplete data case. We show that under certain assumptions, the generalization error can be bounded in terms of the complexity of the underlying feature functions. This allows us to establish the universal consistency of maximum entropy density estimation.

Modeling of the Maximum Entropy Problem as an Optimal Control Problem and its Application to Pdf Estimation of Electricity Price

In this paper, continuous optimal control theory is used to model and solve the maximum entropy problem for a continuous random variable. The maximum entropy principle provides a method for obtaining a least-biased estimate of a probability density function (PDF). In this paper, to find a closed form solution for the maximum entropy problem with any number of moment constraints, the entropy is consi...

Maximum Entropy Formalism and Genetic Algorithms

The maximum entropy principle [1–3] is a powerful tool in the investigation of image reconstruction, spectral analysis, seismic inversion, inverse scattering, etc. It is proven to be the only consistent method for inferring from incomplete information. Here we show that the maximum entropy principle can be cast into an unconstrained optimization problem and therefore genetic algorithms [4, 5] ...

Quasi-continuous maximum entropy distribution approximation with kernel density

This paper extends maximum entropy estimation of discrete probability distributions to the continuous case. This transition leads to a nonparametric estimation of a probability density function, preserving the maximum entropy principle. Furthermore, the derived density estimate provides a minimum mean integrated square error. In a second step, it is shown how boundary conditions can be included...

Publication date: 2007